google robot
VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
Yichao Shen, Fangyun Wei, Zhiying Du, Yaobo Liang, Yan Lu, Jiaolong Yang, Nanning Zheng, Baining Guo
Generalization in robot manipulation is essential for deploying robots in open-world environments and advancing toward artificial general intelligence. While recent Vision-Language-Action (VLA) models leverage large pre-trained understanding models for perception and instruction following, their ability to generalize to novel tasks, objects, and settings remains limited. In this work, we present VideoVLA, a simple approach that explores the potential of transforming large video generation models into robotic VLA manipulators. Given a language instruction and an image, VideoVLA predicts an action sequence as well as the future visual outcomes. Built on a multi-modal Diffusion Transformer, VideoVLA jointly models video, language, and action modalities, using pre-trained video generative models for joint visual and action forecasting. Our experiments show that high-quality imagined futures correlate with reliable action predictions and task success, highlighting the importance of visual imagination in manipulation. VideoVLA demonstrates strong generalization, including imitating other embodiments' skills and handling novel objects. This dual-prediction strategy, forecasting both actions and their visual consequences, explores a paradigm shift in robot learning and unlocks generalization capabilities in manipulation systems.
Google robot can have a conversation but also fetch you a snack
A robot from Google has achieved a level of wide-ranging capability that hasn't been seen before. It can converse with you like a chatbot, answer questions about pictures and even get the right snacks for you from a drawer. The robot uses a version of a language model called PaLM, which Google researchers first created last year.
An Impressive Walking Google Robot Tries to Vacuum the Stairs
This strange-looking, two-legged robot might be the predecessor of a machine that someday helps with chores around the home. The bipedal bot, which has yet to be named, was developed by Schaft, a Japanese robotics company that is part of X, the research lab owned by Alphabet (previously Google). It was revealed at an event in Japan hosted by Andy Rubin, who started Google's robotics project before leaving the company at the end of 2014 to create his own hardware incubator. A video shot by someone at the event shows the robot carrying a heavy-looking gym weight, slipping on a tube without falling over, and cleaning a set of stairs with a vacuum cleaner brush attachment on its feet. It can also be seen walking through a forest and along a rocky beach.